Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 648322 |
| Missing cells | 309786 |
| Missing cells (%) | 2.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 98.9 MiB |
| Average record size in memory | 160.0 B |
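The derived figures in this table follow from the raw counts. The sketch below recomputes them from the values listed above; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 20
n_observations = 648_322
missing_cells = 309_786
total_size_mib = 98.9

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.0f} B")
```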
Variable types
| Categorical | 8 |
|---|---|
| Numeric | 12 |
Alerts
| Alert | Type |
|---|---|
| reportingmunicipalityid has constant value "1061" | Constant |
| egid has a high cardinality: 7701 distinct values | High cardinality |
| statyear is highly correlated with personpseudoid and 1 other field | High correlation |
| personpseudoid is highly correlated with statyear and 1 other field | High correlation |
| ageclass is highly correlated with maritalstatusclass and 2 other fields | High correlation |
| maritalstatusclass is highly correlated with ageclass | High correlation |
| arrivalyearmunicipality is highly correlated with ageclass and 1 other field | High correlation |
| gastws is highly correlated with gkats and 3 other fields | High correlation |
| gazwot is highly correlated with gastws and 2 other fields | High correlation |
| ewid is highly correlated with gastws and 2 other fields | High correlation |
| householdid is highly correlated with statyear and 1 other field | High correlation |
| arrivalyearswitzerland is highly correlated with ageclass and 2 other fields | High correlation |
| wareaclass is highly correlated with wazimclass | High correlation |
| gkats is highly correlated with gastws and 1 other field | High correlation |
| wazimclass is highly correlated with gkats and 1 other field | High correlation |
| reportingmunicipalityid is highly correlated with wareaclass and 5 other fields | High correlation |
| nationalityclass is highly correlated with arrivalyearswitzerland | High correlation |
| populationtype is highly correlated with reportingmunicipalityid | High correlation |
| gbaups is highly correlated with nh | High correlation |
| eh is highly correlated with nh | High correlation |
| nh is highly correlated with gbaups and 4 other fields | High correlation |
| arrivalyearswitzerland has 309786 (47.8%) missing values | Missing |
| personpseudoid has unique values | Unique |
| ageclass has 6689 (1.0%) zeros | Zeros |
Reproduction
| Analysis started | 2022-09-29 07:23:24.386812 |
|---|---|
| Analysis finished | 2022-09-29 07:24:30.807671 |
| Duration | 1 minute and 6.42 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
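The Duration row is the difference between the two timestamps above. A minimal sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2022-09-29 07:23:24.386812")
finished = datetime.fromisoformat("2022-09-29 07:24:30.807671")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```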
reportingmunicipalityid
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2593288 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1061 |
|---|---|
| 2nd row | 1061 |
| 3rd row | 1061 |
| 4th row | 1061 |
| 5th row | 1061 |
Common Values
| Value | Count | Frequency (%) |
| 1061 | 648322 |
[Plots not reproduced: length histogram, category frequency plot]
| Value | Count | Frequency (%) |
| 1061 | 648322 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 1296644 | |
| 0 | 648322 | |
| 6 | 648322 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2593288 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1296644 | |
| 0 | 648322 | |
| 6 | 648322 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2593288 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 1296644 | |
| 0 | 648322 | |
| 6 | 648322 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2593288 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 1296644 | |
| 0 | 648322 | |
| 6 | 648322 |
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2016.394782 |
| Minimum | 2012 |
| Maximum | 2020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 2012 |
|---|---|
| 5-th percentile | 2012 |
| Q1 | 2015 |
| median | 2017 |
| Q3 | 2019 |
| 95-th percentile | 2020 |
| Maximum | 2020 |
| Range | 8 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.491499883 |
|---|---|
| Coefficient of variation (CV) | 0.001235621072 |
| Kurtosis | -0.9710227481 |
| Mean | 2016.394782 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.2411116064 |
| Sum | 1307273098 |
| Variance | 6.207571665 |
| Monotonicity | Increasing |
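Several of the values above are linked by simple identities: the range is maximum minus minimum, the IQR is Q3 minus Q1, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks these against the reported figures; the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 2012, 2020
q1, q3 = 2015, 2019
mean = 2016.394782
std = 2.491499883

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
variance = std ** 2               # Variance
cv = std / mean                   # Coefficient of variation

print(value_range, iqr, round(variance, 6), round(cv, 9))
```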
| Value | Count | Frequency (%) |
| 2020 | 82090 | 12.7% |
| 2019 | 81588 | 12.6% |
| 2015 | 81329 | 12.5% |
| 2014 | 81111 | 12.5% |
| 2018 | 81072 | 12.5% |
| 2016 | 81023 | 12.5% |
| 2017 | 80933 | 12.5% |
| 2012 | 79176 | 12.2% |
| Value | Count | Frequency (%) |
| 2012 | 79176 | |
| 2014 | 81111 | |
| 2015 | 81329 | |
| 2016 | 81023 | |
| 2017 | 80933 | |
| 2018 | 81072 | |
| 2019 | 81588 | |
| 2020 | 82090 |
| Value | Count | Frequency (%) |
| 2020 | 82090 | |
| 2019 | 81588 | |
| 2018 | 81072 | |
| 2017 | 80933 | |
| 2016 | 81023 | |
| 2015 | 81329 | |
| 2014 | 81111 | |
| 2012 | 79176 |
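The Frequency (%) column is each count divided by the number of observations. A minimal sketch that recomputes it from the counts listed above (note that no rows carry the value 2013):

```python
# Counts per value, copied from the tables above.
counts = {
    2012: 79176, 2014: 81111, 2015: 81329, 2016: 81023,
    2017: 80933, 2018: 81072, 2019: 81588, 2020: 82090,
}

total = sum(counts.values())  # should equal the number of observations

# Percentage share of each value, rounded to one decimal as in the report.
freq_pct = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, pct in freq_pct.items():
    print(year, counts[year], f"{pct}%")
```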
personpseudoid
| Distinct | 648322 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.016794925 × 10^14 |
| Minimum | 2.012400076 × 10^14 |
| Maximum | 2.020400135 × 10^14 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 2.012400076 × 10^14 |
|---|---|
| 5-th percentile | 2.012400111 × 10^14 |
| Q1 | 2.01540017 × 10^14 |
| median | 2.017400031 × 10^14 |
| Q3 | 2.019400144 × 10^14 |
| 95-th percentile | 2.020400095 × 10^14 |
| Maximum | 2.020400135 × 10^14 |
| Range | 8.000059248 × 10^11 |
| Interquartile range (IQR) | 3.99997441 × 10^11 |
Descriptive statistics
| Standard deviation | 2.491481859 × 10^11 |
|---|---|
| Coefficient of variation (CV) | 0.001235366982 |
| Kurtosis | -0.9709787101 |
| Mean | 2.016794925 × 10^14 |
| Median Absolute Deviation (MAD) | 1.999862345 × 10^11 |
| Skewness | -0.2411306764 |
| Sum | 1.626043436 × 10^18 |
| Variance | 6.207481855 × 10^22 |
| Monotonicity | Not monotonic |
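The reported Sum (about 1.63 × 10^18) is much smaller than the number of observations times the mean (about 1.31 × 10^20). One possible explanation, which is an assumption and is not stated in the report, is that the sum was accumulated in a signed 64-bit integer and wrapped around. The sketch below reproduces the reported figure under that assumption, using the rounded mean as an integer.

```python
n = 648_322                    # number of observations
mean = 201_679_492_500_000     # reported mean, 2.016794925e14, as an integer
reported_sum = 1.626043436e18  # reported Sum

# Python integers have arbitrary precision, so this product does not overflow.
exact_sum = n * mean

# Keep only the low 64 bits, then reinterpret them as a signed int64.
wrapped = exact_sum % 2**64
if wrapped >= 2**63:
    wrapped -= 2**64

print(f"exact:   {exact_sum:.9e}")
print(f"wrapped: {wrapped:.9e}")
```

Because the mean is only given to ten significant digits, the wrapped value agrees with the reported Sum to about eight digits rather than exactly.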
| Value | Count | Frequency (%) |
| 2.012400153 × 10^14 | 1 | < 0.1% |
| 2.01840005 × 10^14 | 1 | < 0.1% |
| 2.018400083 × 10^14 | 1 | < 0.1% |
| 2.018400071 × 10^14 | 1 | < 0.1% |
| 2.018400109 × 10^14 | 1 | < 0.1% |
| 2.018400136 × 10^14 | 1 | < 0.1% |
| 2.018400064 × 10^14 | 1 | < 0.1% |
| 2.01840011 × 10^14 | 1 | < 0.1% |
| 2.018400049 × 10^14 | 1 | < 0.1% |
| 2.018400078 × 10^14 | 1 | < 0.1% |
| Other values (648312) | 648312 |
| Value | Count | Frequency (%) |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 | |
| 2.012400076 × 10^14 | 1 |
| Value | Count | Frequency (%) |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 | |
| 2.020400135 × 10^14 | 1 |
ageclass
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.91999346 |
| Minimum | 0 |
| Maximum | 105 |
| Zeros | 6689 |
| Zeros (%) | 1.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 25 |
| median | 40 |
| Q3 | 55 |
| 95-th percentile | 80 |
| Maximum | 105 |
| Range | 105 |
| Interquartile range (IQR) | 30 |
Descriptive statistics
| Standard deviation | 21.97714858 |
|---|---|
| Coefficient of variation (CV) | 0.5505298642 |
| Kurtosis | -0.7608938002 |
| Mean | 39.91999346 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | 0.2134019693 |
| Sum | 25881010 |
| Variance | 482.9950596 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 25 | 64641 | 10.0% |
| 30 | 62707 | 9.7% |
| 35 | 51958 | 8.0% |
| 50 | 44763 | 6.9% |
| 40 | 43798 | 6.8% |
| 45 | 43163 | 6.7% |
| 20 | 42044 | 6.5% |
| 55 | 40684 | 6.3% |
| 60 | 34933 | 5.4% |
| 65 | 31031 | 4.8% |
| Other values (20) | 188600 |
| Value | Count | Frequency (%) |
| 0 | 6689 | |
| 1 | 6347 | |
| 2 | 5908 | |
| 3 | 5610 | |
| 4 | 5358 | |
| 5 | 5222 | |
| 6 | 5115 | |
| 7 | 4992 | |
| 8 | 4883 | |
| 9 | 4818 |
| Value | Count | Frequency (%) |
| 105 | 3 | < 0.1% |
| 100 | 63 | < 0.1% |
| 95 | 801 | 0.1% |
| 90 | 4120 | 0.6% |
| 85 | 11261 | 1.7% |
| 80 | 18486 | |
| 75 | 23493 | |
| 70 | 28031 | |
| 65 | 31031 | |
| 60 | 34933 |
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 19 |
|---|---|
| Median length | 11 |
| Mean length | 11.32123852 |
| Min length | 5 |
Characters and Unicode
| Total characters | 7339808 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Switzerland |
|---|---|
| 2nd row | Central Europe |
| 3rd row | Southeastern Europe |
| 4th row | Southeastern Europe |
| 5th row | Southeastern Europe |
Common Values
| Value | Count | Frequency (%) |
| Switzerland | 495013 | |
| Central Europe | 44792 | 6.9% |
| Other | 42671 | 6.6% |
| Southern Europe | 36095 | 5.6% |
| Southeastern Europe | 18834 | 2.9% |
| Western Europe | 7039 | 1.1% |
| Northern Europe | 2113 | 0.3% |
| Eastern Europe | 1765 | 0.3% |
[Plots not reproduced: length histogram, category frequency plot]
| Value | Count | Frequency (%) |
| switzerland | 495013 | |
| europe | 110638 | 14.6% |
| central | 44792 | 5.9% |
| other | 42671 | 5.6% |
| southern | 36095 | 4.8% |
| southeastern | 18834 | 2.5% |
| western | 7039 | 0.9% |
| northern | 2113 | 0.3% |
| eastern | 1765 | 0.2% |
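The word counts above can be reproduced from the category counts by lowercasing each label, splitting it on whitespace, and weighting each word by the number of rows carrying that label. The sketch below assumes this is how the table was derived; the profiler's actual tokenizer may differ in detail.

```python
from collections import Counter

# Category counts copied from the "Common Values" table.
category_counts = {
    "Switzerland": 495013,
    "Central Europe": 44792,
    "Other": 42671,
    "Southern Europe": 36095,
    "Southeastern Europe": 18834,
    "Western Europe": 7039,
    "Northern Europe": 2113,
    "Eastern Europe": 1765,
}

# Each row contributes every word of its label once.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
europe_pct = round(100 * word_counts["europe"] / total_words, 1)

print(word_counts.most_common(3), total_words, europe_pct)
```

The percentages in the word table are shares of all words (758960), not of all rows, which is why "europe" shows 14.6% rather than 17.1%.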
Most occurring characters
| Value | Count | Frequency (%) |
| e | 784833 | |
| r | 761073 | |
| t | 667156 | |
| n | 605651 | |
| a | 560404 | |
| S | 549942 | 7.5% |
| l | 539805 | 7.4% |
| z | 495013 | 6.7% |
| i | 495013 | 6.7% |
| d | 495013 | 6.7% |
| Other values (12) | 1385905 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6470210 | |
| Uppercase Letter | 758960 | 10.3% |
| Space Separator | 110638 | 1.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 784833 | |
| r | 761073 | |
| t | 667156 | |
| n | 605651 | |
| a | 560404 | |
| l | 539805 | |
| z | 495013 | |
| i | 495013 | |
| d | 495013 | |
| w | 495013 | |
| Other values (5) | 571236 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 549942 | |
| E | 112403 | 14.8% |
| C | 44792 | 5.9% |
| O | 42671 | 5.6% |
| W | 7039 | 0.9% |
| N | 2113 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| 110638 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7229170 | |
| Common | 110638 | 1.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 784833 | |
| r | 761073 | |
| t | 667156 | |
| n | 605651 | |
| a | 560404 | |
| S | 549942 | |
| l | 539805 | |
| z | 495013 | 6.8% |
| i | 495013 | 6.8% |
| d | 495013 | 6.8% |
| Other values (11) | 1275267 |
Common
| Value | Count | Frequency (%) |
| 110638 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7339808 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 784833 | |
| r | 761073 | |
| t | 667156 | |
| n | 605651 | |
| a | 560404 | |
| S | 549942 | 7.5% |
| l | 539805 | 7.4% |
| z | 495013 | 6.7% |
| i | 495013 | 6.7% |
| d | 495013 | 6.7% |
| Other values (12) | 1385905 |
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 648322 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
[Plots not reproduced: length histogram, category frequency plot]
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 648322 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 648322 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 648322 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 628018 | |
| 3 | 16464 | 2.5% |
| 2 | 3817 | 0.6% |
| 4 | 23 | < 0.1% |
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.000046273 |
| Min length | 1 |
Characters and Unicode
| Total characters | 648352 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| -9 | 30 | < 0.1% |
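The length and character totals for this variable follow from the value counts: every value is one character long except "-9", which contributes two. The sketch below recomputes them. That "-9" is a code for an unknown or not-applicable value is an assumption; the report does not say.

```python
# Value counts copied from the "Common Values" table; values are strings.
value_counts = {"1": 331607, "2": 233250, "4": 54285, "3": 29150, "-9": 30}

n_rows = sum(value_counts.values())

# Total characters = sum over values of (string length * number of rows).
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 9))
```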
[Plots not reproduced: length histogram, category frequency plot]
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| 9 | 30 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| - | 30 | < 0.1% |
| 9 | 30 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 648322 | |
| Dash Punctuation | 30 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| 9 | 30 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 30 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 648352 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| - | 30 | < 0.1% |
| 9 | 30 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 648352 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 331607 | |
| 2 | 233250 | |
| 4 | 54285 | 8.4% |
| 3 | 29150 | 4.5% |
| - | 30 | < 0.1% |
| 9 | 30 | < 0.1% |
| Distinct | 101 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3384.576534 |
| Minimum | 1922 |
| Maximum | 9999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1922 |
|---|---|
| 5-th percentile | 1967 |
| Q1 | 1997 |
| median | 2010 |
| Q3 | 2017 |
| 95-th percentile | 9997 |
| Maximum | 9999 |
| Range | 8077 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 3024.973862 |
|---|---|
| Coefficient of variation (CV) | 0.8937525362 |
| Kurtosis | 0.9876010888 |
| Mean | 3384.576534 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 1.728417163 |
| Sum | 2194295428 |
| Variance | 9150466.863 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9997 | 112147 | 17.3% |
| 2012 | 28045 | 4.3% |
| 2014 | 26309 | 4.1% |
| 2015 | 24163 | 3.7% |
| 2013 | 22421 | 3.5% |
| 2011 | 22250 | 3.4% |
| 2016 | 20976 | 3.2% |
| 2010 | 19274 | 3.0% |
| 2009 | 19266 | 3.0% |
| 2008 | 18516 | 2.9% |
| Other values (91) | 334955 |
| Value | Count | Frequency (%) |
| 1922 | 1 | < 0.1% |
| 1923 | 1 | < 0.1% |
| 1924 | 1 | < 0.1% |
| 1925 | 2 | < 0.1% |
| 1926 | 7 | < 0.1% |
| 1927 | 16 | |
| 1928 | 29 | |
| 1929 | 16 | |
| 1930 | 15 | < 0.1% |
| 1931 | 38 |
| Value | Count | Frequency (%) |
| 9999 | 49 | < 0.1% |
| 9997 | 112147 | |
| 2020 | 6576 | 1.0% |
| 2019 | 12515 | 1.9% |
| 2018 | 15700 | 2.4% |
| 2017 | 18233 | 2.8% |
| 2016 | 20976 | 3.2% |
| 2015 | 24163 | 3.7% |
| 2014 | 26309 | 4.1% |
| 2013 | 22421 | 3.5% |
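The mean of about 3384.58 reported for this variable is pulled upward by the values 9997 and 9999, which lie far outside the range of plausible years. Assuming they are placeholder codes (the report does not confirm this), the mean of the remaining values can be recovered from the reported sum and counts alone, as sketched below.

```python
n_rows = 648_322
reported_sum = 2_194_295_428

# value -> count, copied from the tables above.
# Treating these as placeholder codes is an assumption.
placeholder_counts = {9997: 112_147, 9999: 49}

placeholder_rows = sum(placeholder_counts.values())
placeholder_sum = sum(value * count for value, count in placeholder_counts.items())

valid_rows = n_rows - placeholder_rows
valid_mean = (reported_sum - placeholder_sum) / valid_rows

print(f"mean over all rows:      {reported_sum / n_rows:.2f}")
print(f"mean without 9997/9999:  {valid_mean:.2f}")
```

The same applies to the standard deviation, skewness and 95-th percentile, but those cannot be corrected from the summary figures alone.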
arrivalyearswitzerland
| Distinct | 84 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 309786 |
| Missing (%) | 47.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1043.882893 |
| Minimum | -5 |
| Maximum | 2020 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 161570 |
| Negative (%) | 24.9% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | -5 |
|---|---|
| 5-th percentile | -5 |
| Q1 | -5 |
| median | 1968 |
| Q3 | 2007 |
| 95-th percentile | 2016 |
| Maximum | 2020 |
| Range | 2025 |
| Interquartile range (IQR) | 2012 |
Descriptive statistics
| Standard deviation | 1002.27575 |
|---|---|
| Coefficient of variation (CV) | 0.9601419435 |
| Kurtosis | -1.991344207 |
| Mean | 1043.882893 |
| Median Absolute Deviation (MAD) | 50 |
| Skewness | -0.09071918136 |
| Sum | 353391939 |
| Variance | 1004556.678 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -5 | 161570 | |
| 2012 | 8931 | 1.4% |
| 2014 | 8645 | 1.3% |
| 2015 | 8406 | 1.3% |
| 2011 | 7990 | 1.2% |
| 2013 | 7952 | 1.2% |
| 2008 | 7069 | 1.1% |
| 2010 | 6369 | 1.0% |
| 2016 | 6352 | 1.0% |
| 2007 | 6313 | 1.0% |
| Other values (74) | 108939 | 16.8% |
| (Missing) | 309786 |
| Value | Count | Frequency (%) |
| -5 | 161570 | |
| 1931 | 8 | < 0.1% |
| 1932 | 2 | < 0.1% |
| 1938 | 17 | < 0.1% |
| 1939 | 4 | < 0.1% |
| 1940 | 6 | < 0.1% |
| 1941 | 3 | < 0.1% |
| 1942 | 2 | < 0.1% |
| 1944 | 8 | < 0.1% |
| 1946 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 2020 | 1401 | 0.2% |
| 2019 | 3079 | 0.5% |
| 2018 | 4034 | |
| 2017 | 5052 | |
| 2016 | 6352 | |
| 2015 | 8406 | |
| 2014 | 8645 | |
| 2013 | 7952 | |
| 2012 | 8931 | |
| 2011 | 7990 |
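For this variable (309786 missing values, which the alerts attribute to arrivalyearswitzerland), the quantiles are easier to read once the rows are split into missing, -5, and actual years. The sketch below does that split. It assumes that the quantiles are computed over non-missing rows only and that -5 is a placeholder code; neither is stated in the report.

```python
n_rows = 648_322
n_missing = 309_786
n_minus5 = 161_570  # rows holding -5, treated here as a placeholder code

n_present = n_rows - n_missing
n_years = n_present - n_minus5

# Share of non-missing rows that hold -5.
share_minus5 = n_minus5 / n_present

print(n_present, n_years, round(share_minus5, 3))
```

With roughly 47.7% of the non-missing rows equal to -5, the 5-th percentile and Q1 both land on -5 while the median falls just past that block, which is consistent with the reported interquartile range of 2012.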
egid
| Distinct | 7701 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 44 |
|---|---|
| Median length | 44 |
| Mean length | 44 |
| Min length | 44 |
Characters and Unicode
| Total characters | 28526168 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 26 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | M9+DpAhjwnQRpDeeMxtGpPzNcu3BmCRYu+f6PPDpcFw= |
|---|---|
| 2nd row | M9+DpAhjwnQRpDeeMxtGpPzNcu3BmCRYu+f6PPDpcFw= |
| 3rd row | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= |
| 4th row | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= |
| 5th row | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= |
Common Values
| Value | Count | Frequency (%) |
| 5xpzYwWo6Vs4qUHpPbcHCyb2bXNzuV1ke6eTHb6RVuk= | 1760 | 0.3% |
| msXpQv+xlQ4cpSL1lBciTtiudZN0nTyKSPNK/a2Cpgo= | 1745 | 0.3% |
| spwam3nBr088N0FSc+5UQQXld9T4LVTtcDF0/2epOVI= | 1063 | 0.2% |
| qBiDXzkp79Jv7UoMWT5atdmDoCRdeYkyB9r52VTfchI= | 961 | 0.1% |
| 4o/uHzHZVybdJFOuehm+EG0q/AUC2buECmFsDJLYUqY= | 885 | 0.1% |
| xTw2TUMDOCWvBlZFKTb8Py6pLTVY078AvB9juXAYc4o= | 851 | 0.1% |
| tlqIJPBcflgWlvFrG3d4Le4Tx180MdDZu35xB8qbXko= | 803 | 0.1% |
| JVeVISGfnsbbin/lNFHivcegGPSXZ3sCcRg5SJCqAtg= | 798 | 0.1% |
| UAgfgEOFlz0LaIvvhA8aeBUTsZo8DRe2vbzTD4VMdy0= | 789 | 0.1% |
| mHgz4OqC95vIpWu8ha556jGuJ9+RjvMUNK3CpC7FXVw= | 776 | 0.1% |
| Other values (7691) | 637891 |
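Every value in this column is 44 characters long and ends in "=", which is the shape of a Base64-encoded 32-byte value. The sketch below decodes one value from the table to check this. A 32-byte payload would be consistent with, for example, a SHA-256 digest used for pseudonymisation, but that is an assumption the report does not confirm.

```python
import base64

# One value copied from the "Common Values" table.
sample = "5xpzYwWo6Vs4qUHpPbcHCyb2bXNzuV1ke6eTHb6RVuk="

# validate=True rejects characters outside the Base64 alphabet.
raw = base64.b64decode(sample, validate=True)

# Fixed length per value times row count gives the "Total characters" figure.
total_chars = len(sample) * 648_322

print(len(sample), len(raw), total_chars)
```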
[Plot not reproduced: length histogram]
| Value | Count | Frequency (%) |
| 5xpzywwo6vs4quhppbchcyb2bxnzuv1ke6ethb6rvuk | 1760 | 0.3% |
| msxpqv+xlq4cpsl1lbcittiudzn0ntykspnk/a2cpgo | 1745 | 0.3% |
| spwam3nbr088n0fsc+5uqqxld9t4lvttcdf0/2epovi | 1063 | 0.2% |
| qbidxzkp79jv7uomwt5atdmdocrdeykyb9r52vtfchi | 961 | 0.1% |
| 4o/uhzhzvybdjfouehm+eg0q/auc2buecmfsdjlyuqy | 885 | 0.1% |
| xtw2tumdocwvblzfktb8py6pltvy078avb9juxayc4o | 851 | 0.1% |
| tlqijpbcflgwlvfrg3d4le4tx180mddzu35xb8qbxko | 803 | 0.1% |
| jvevisgfnsbbin/lnfhivceggpsxz3sccrg5sjcqatg | 798 | 0.1% |
| uagfgeoflz0laivvha8aebutszo8dre2vbztd4vmdy0 | 789 | 0.1% |
| mhgz4oqc95vipwu8ha556jguj9+rjvmunk3cpc7fxvw | 776 | 0.1% |
| Other values (7691) | 637891 |
Most occurring characters
| Value | Count | Frequency (%) |
| = | 648322 | 2.3% |
| k | 484634 | 1.7% |
| o | 484209 | 1.7% |
| c | 478990 | 1.7% |
| M | 471076 | 1.7% |
| 8 | 470683 | 1.7% |
| Y | 469337 | 1.6% |
| g | 468794 | 1.6% |
| A | 467568 | 1.6% |
| Q | 466577 | 1.6% |
| Other values (55) | 23615978 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11332699 | |
| Uppercase Letter | 11311461 | |
| Decimal Number | 4390350 | 15.4% |
| Math Symbol | 1071942 | 3.8% |
| Other Punctuation | 419716 | 1.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| k | 484634 | 4.3% |
| o | 484209 | 4.3% |
| c | 478990 | 4.2% |
| g | 468794 | 4.1% |
| w | 462744 | 4.1% |
| s | 460641 | 4.1% |
| d | 439836 | 3.9% |
| x | 436840 | 3.9% |
| p | 433963 | 3.8% |
| b | 431867 | 3.8% |
| Other values (16) | 6750181 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 471076 | 4.2% |
| Y | 469337 | 4.1% |
| A | 467568 | 4.1% |
| Q | 466577 | 4.1% |
| I | 466196 | 4.1% |
| U | 460383 | 4.1% |
| E | 451199 | 4.0% |
| B | 436215 | 3.9% |
| O | 431010 | 3.8% |
| R | 430905 | 3.8% |
| Other values (16) | 6760995 |
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 470683 | |
| 4 | 462984 | |
| 0 | 461026 | |
| 5 | 437044 | |
| 7 | 433536 | |
| 2 | 429306 | |
| 3 | 427356 | |
| 1 | 423494 | |
| 6 | 423001 | |
| 9 | 421920 |
Math Symbol
| Value | Count | Frequency (%) |
| = | 648322 | |
| + | 423620 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 419716 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22644160 | |
| Common | 5882008 | 20.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| k | 484634 | 2.1% |
| o | 484209 | 2.1% |
| c | 478990 | 2.1% |
| M | 471076 | 2.1% |
| Y | 469337 | 2.1% |
| g | 468794 | 2.1% |
| A | 467568 | 2.1% |
| Q | 466577 | 2.1% |
| I | 466196 | 2.1% |
| w | 462744 | 2.0% |
| Other values (42) | 17924035 |
Common
| Value | Count | Frequency (%) |
| = | 648322 | |
| 8 | 470683 | 8.0% |
| 4 | 462984 | 7.9% |
| 0 | 461026 | 7.8% |
| 5 | 437044 | 7.4% |
| 7 | 433536 | 7.4% |
| 2 | 429306 | 7.3% |
| 3 | 427356 | 7.3% |
| + | 423620 | 7.2% |
| 1 | 423494 | 7.2% |
| Other values (3) | 1264637 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 28526168 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| = | 648322 | 2.3% |
| k | 484634 | 1.7% |
| o | 484209 | 1.7% |
| c | 478990 | 1.7% |
| M | 471076 | 1.7% |
| 8 | 470683 | 1.7% |
| Y | 469337 | 1.6% |
| g | 468794 | 1.6% |
| A | 467568 | 1.6% |
| Q | 466577 | 1.6% |
| Other values (55) | 23615978 |
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8014.678691 |
| Minimum | 8011 |
| Maximum | 8023 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 8011 |
|---|---|
| 5-th percentile | 8011 |
| Q1 | 8012 |
| median | 8014 |
| Q3 | 8016 |
| 95-th percentile | 8022 |
| Maximum | 8023 |
| Range | 12 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.256659585 |
|---|---|
| Coefficient of variation (CV) | 0.0004063368864 |
| Kurtosis | -0.07925653742 |
| Mean | 8014.678691 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.9925711294 |
| Sum | 5196092518 |
| Variance | 10.60583165 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8013 | 119420 | |
| 8014 | 97738 | |
| 8012 | 96049 | |
| 8011 | 86088 | |
| 8015 | 76746 | |
| 8022 | 29981 | 4.6% |
| 8021 | 28988 | 4.5% |
| 8019 | 22698 | 3.5% |
| 8020 | 22221 | 3.4% |
| 8018 | 22218 | 3.4% |
| Other values (3) | 46175 | 7.1% |
| Value | Count | Frequency (%) |
| 8011 | 86088 | |
| 8012 | 96049 | |
| 8013 | 119420 | |
| 8014 | 97738 | |
| 8015 | 76746 | |
| 8016 | 21281 | 3.3% |
| 8017 | 18669 | 2.9% |
| 8018 | 22218 | 3.4% |
| 8019 | 22698 | 3.5% |
| 8020 | 22221 | 3.4% |
| Value | Count | Frequency (%) |
| 8023 | 6225 | 1.0% |
| 8022 | 29981 | 4.6% |
| 8021 | 28988 | 4.5% |
| 8020 | 22221 | 3.4% |
| 8019 | 22698 | 3.5% |
| 8018 | 22218 | 3.4% |
| 8017 | 18669 | 2.9% |
| 8016 | 21281 | 3.3% |
| 8015 | 76746 | |
| 8014 | 97738 |
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2593288 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1025 |
|---|---|
| 2nd row | 1025 |
| 3rd row | 1025 |
| 4th row | 1025 |
| 5th row | 1025 |
Common Values
| Value | Count | Frequency (%) |
| 1025 | 487396 | |
| 1030 | 111919 | 17.3% |
| 1021 | 38159 | 5.9% |
| 1040 | 10848 | 1.7% |
[Plots not reproduced: length histogram, category frequency plot]
| Value | Count | Frequency (%) |
| 1025 | 487396 | |
| 1030 | 111919 | 17.3% |
| 1021 | 38159 | 5.9% |
| 1040 | 10848 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 771089 | |
| 1 | 686481 | |
| 2 | 525555 | |
| 5 | 487396 | |
| 3 | 111919 | 4.3% |
| 4 | 10848 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2593288 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 771089 | |
| 1 | 686481 | |
| 2 | 525555 | |
| 5 | 487396 | |
| 3 | 111919 | 4.3% |
| 4 | 10848 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2593288 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 771089 | |
| 1 | 686481 | |
| 2 | 525555 | |
| 5 | 487396 | |
| 3 | 111919 | 4.3% |
| 4 | 10848 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2593288 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 771089 | |
| 1 | 686481 | |
| 2 | 525555 | |
| 5 | 487396 | |
| 3 | 111919 | 4.3% |
| 4 | 10848 | 0.4% |
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.415073066 |
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 9 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.748693059 |
|---|---|
| Coefficient of variation (CV) | 0.5076003639 |
| Kurtosis | 29.69035317 |
| Mean | 5.415073066 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.984673665 |
| Sum | 3510711 |
| Variance | 7.55531353 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 157117 | |
| 5 | 124614 | |
| 6 | 87722 | |
| 7 | 79013 | |
| 3 | 78392 | |
| 8 | 46285 | 7.1% |
| 2 | 30502 | 4.7% |
| 9 | 20682 | 3.2% |
| 10 | 5147 | 0.8% |
| 15 | 3760 | 0.6% |
| Other values (7) | 15088 | 2.3% |
| Value | Count | Frequency (%) |
| 1 | 2330 | 0.4% |
| 2 | 30502 | 4.7% |
| 3 | 78392 | |
| 4 | 157117 | |
| 5 | 124614 | |
| 6 | 87722 | |
| 7 | 79013 | |
| 8 | 46285 | 7.1% |
| 9 | 20682 | 3.2% |
| 10 | 5147 | 0.8% |
| Value | Count | Frequency (%) |
| 31 | 1745 | 0.3% |
| 27 | 1760 | 0.3% |
| 16 | 1298 | 0.2% |
| 15 | 3760 | 0.6% |
| 13 | 2413 | 0.4% |
| 12 | 2238 | 0.3% |
| 11 | 3304 | 0.5% |
| 10 | 5147 | 0.8% |
| 9 | 20682 | |
| 8 | 46285 |
| Distinct | 69 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.29786588 |
| Minimum | 1 |
| Maximum | 186 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 9 |
| Q3 | 14 |
| 95-th percentile | 36 |
| Maximum | 186 |
| Range | 185 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 17.31660912 |
|---|---|
| Coefficient of variation (CV) | 1.302209639 |
| Kurtosis | 45.77917571 |
| Mean | 13.29786588 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 5.724694582 |
| Sum | 8621299 |
| Variance | 299.8649515 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 56955 | 8.8% |
| 6 | 50585 | 7.8% |
| 1 | 44583 | 6.9% |
| 12 | 43187 | 6.7% |
| 7 | 41821 | 6.5% |
| 10 | 37651 | 5.8% |
| 4 | 34457 | 5.3% |
| 3 | 33540 | 5.2% |
| 9 | 30832 | 4.8% |
| 11 | 29485 | 4.5% |
| Other values (59) | 245226 |
| Value | Count | Frequency (%) |
| 1 | 44583 | |
| 2 | 19897 | 3.1% |
| 3 | 33540 | |
| 4 | 34457 | |
| 5 | 20812 | 3.2% |
| 6 | 50585 | |
| 7 | 41821 | |
| 8 | 56955 | |
| 9 | 30832 | |
| 10 | 37651 |
| Value | Count | Frequency (%) |
| 186 | 1745 | |
| 179 | 803 | |
| 145 | 1760 | |
| 105 | 103 | < 0.1% |
| 89 | 688 | 0.1% |
| 86 | 682 | 0.1% |
| 85 | 695 | 0.1% |
| 83 | 644 | 0.1% |
| 81 | 616 | 0.1% |
| 72 | 702 | 0.1% |
| Distinct | 106 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2665736.1 |
| Minimum | 2658300 |
| Maximum | 2669800 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 2658300 |
|---|---|
| 5-th percentile | 2662600 |
| Q1 | 2664500 |
| median | 2665900 |
| Q3 | 2666700 |
| 95-th percentile | 2669000 |
| Maximum | 2669800 |
| Range | 11500 |
| Interquartile range (IQR) | 2200 |
Descriptive statistics
| Standard deviation | 1764.035681 |
|---|---|
| Coefficient of variation (CV) | 0.0006617443046 |
| Kurtosis | -0.005506112717 |
| Mean | 2665736.1 |
| Median Absolute Deviation (MAD) | 1000 |
| Skewness | -0.1004844608 |
| Sum | 1.72825536 × 10^12 |
| Variance | 3111821.885 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2666000 | 30873 | 4.8% |
| 2665900 | 29953 | 4.6% |
| 2666100 | 26315 | 4.1% |
| 2665800 | 23428 | 3.6% |
| 2665700 | 21074 | 3.3% |
| 2665500 | 19512 | 3.0% |
| 2665600 | 18510 | 2.9% |
| 2666500 | 18132 | 2.8% |
| 2666300 | 16700 | 2.6% |
| 2666400 | 16700 | 2.6% |
| Other values (96) | 427125 |
| Value | Count | Frequency (%) |
| 2658300 | 49 | < 0.1% |
| 2658500 | 104 | |
| 2658600 | 32 | < 0.1% |
| 2658900 | 90 | |
| 2659000 | 51 | < 0.1% |
| 2659100 | 10 | < 0.1% |
| 2659400 | 16 | < 0.1% |
| 2659500 | 81 | |
| 2659600 | 145 | |
| 2659700 | 69 |
| Value | Count | Frequency (%) |
| 2669800 | 1539 | 0.2% |
| 2669700 | 2675 | |
| 2669600 | 1983 | 0.3% |
| 2669500 | 2455 | 0.4% |
| 2669400 | 3587 | |
| 2669300 | 5979 | |
| 2669200 | 5550 | |
| 2669100 | 6337 | |
| 2669000 | 6591 | |
| 2668900 | 5998 |
| Distinct | 54 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1211418.419 |
| Minimum | 1208800 |
|---|---|
| Maximum | 1214400 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1208800 |
|---|---|
| 5-th percentile | 1209700 |
| Q1 | 1210600 |
| median | 1211600 |
| Q3 | 1212200 |
| 95-th percentile | 1213000 |
| Maximum | 1214400 |
| Range | 5600 |
| Interquartile range (IQR) | 1600 |
Descriptive statistics
| Standard deviation | 999.7456894 |
|---|---|
| Coefficient of variation (CV) | 0.0008252686882 |
| Kurtosis | -0.897598571 |
| Mean | 1211418.419 |
| Median Absolute Deviation (MAD) | 800 |
| Skewness | -0.1470860595 |
| Sum | 7.853892122 × 10¹¹ |
| Variance | 999491.4435 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1211600 | 30228 | 4.7% |
| 1212000 | 28262 | 4.4% |
| 1211800 | 28019 | 4.3% |
| 1211700 | 25163 | 3.9% |
| 1211900 | 24259 | 3.7% |
| 1212100 | 23056 | 3.6% |
| 1212400 | 22272 | 3.4% |
| 1212500 | 21188 | 3.3% |
| 1211300 | 20316 | 3.1% |
| 1212300 | 20120 | 3.1% |
| Other values (44) | 405439 |
| Value | Count | Frequency (%) |
| 1208800 | 7 | < 0.1% |
| 1209200 | 16 | < 0.1% |
| 1209300 | 3686 | 0.6% |
| 1209400 | 2815 | 0.4% |
| 1209500 | 5115 | 0.8% |
| 1209600 | 9486 | |
| 1209700 | 12465 | |
| 1209800 | 11368 | |
| 1209900 | 15225 | |
| 1210000 | 15449 |
| Value | Count | Frequency (%) |
| 1214400 | 37 | < 0.1% |
| 1214300 | 57 | < 0.1% |
| 1214200 | 92 | < 0.1% |
| 1214100 | 16 | < 0.1% |
| 1214000 | 72 | < 0.1% |
| 1213900 | 40 | < 0.1% |
| 1213800 | 51 | < 0.1% |
| 1213700 | 202 | < 0.1% |
| 1213600 | 306 | < 0.1% |
| 1213500 | 1291 |
| Distinct | 186 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.784693409 |
| Minimum | 1 |
|---|---|
| Maximum | 186 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 5 |
| Q3 | 9 |
| 95-th percentile | 24 |
| Maximum | 186 |
| Range | 185 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 11.61896443 |
|---|---|
| Coefficient of variation (CV) | 1.49253976 |
| Kurtosis | 59.6049167 |
| Mean | 7.784693409 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 6.267501926 |
| Sum | 5046988 |
| Variance | 135.0003344 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 113790 | |
| 2 | 76111 | |
| 3 | 66539 | |
| 4 | 56140 | |
| 5 | 48451 | 7.5% |
| 6 | 44444 | 6.9% |
| 7 | 35839 | 5.5% |
| 8 | 30824 | 4.8% |
| 9 | 24708 | 3.8% |
| 10 | 20722 | 3.2% |
| Other values (176) | 130754 |
| Value | Count | Frequency (%) |
| 1 | 113790 | |
| 2 | 76111 | |
| 3 | 66539 | |
| 4 | 56140 | |
| 5 | 48451 | 7.5% |
| 6 | 44444 | 6.9% |
| 7 | 35839 | 5.5% |
| 8 | 30824 | 4.8% |
| 9 | 24708 | 3.8% |
| 10 | 20722 | 3.2% |
| Value | Count | Frequency (%) |
| 186 | 10 | |
| 185 | 13 | |
| 184 | 15 | |
| 183 | 7 | < 0.1% |
| 182 | 8 | |
| 181 | 7 | < 0.1% |
| 180 | 8 | |
| 179 | 8 | |
| 178 | 5 | < 0.1% |
| 177 | 19 |
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
| 3-4 | 428701 |
|---|---|
| 5-6 | 117498 |
| 1-2 | 86693 |
| >6 | 15430 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 2.976200098 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1929536 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3-4 |
|---|---|
| 2nd row | 3-4 |
| 3rd row | 3-4 |
| 4th row | 3-4 |
| 5th row | 3-4 |
Common Values
| Value | Count | Frequency (%) |
| 3-4 | 428701 | |
| 5-6 | 117498 | 18.1% |
| 1-2 | 86693 | 13.4% |
| >6 | 15430 | 2.4% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| 3-4 | 428701 | |
| 5-6 | 117498 | 18.1% |
| 1-2 | 86693 | 13.4% |
| >6 | 15430 | 2.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 632892 | |
| 3 | 428701 | |
| 4 | 428701 | |
| 6 | 132928 | 6.9% |
| 5 | 117498 | 6.1% |
| 1 | 86693 | 4.5% |
| 2 | 86693 | 4.5% |
| > | 15430 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1281214 | |
| Dash Punctuation | 632892 | |
| Math Symbol | 15430 | 0.8% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 428701 | |
| 4 | 428701 | |
| 6 | 132928 | 10.4% |
| 5 | 117498 | 9.2% |
| 1 | 86693 | 6.8% |
| 2 | 86693 | 6.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 632892 |
Math Symbol
| Value | Count | Frequency (%) |
| > | 15430 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1929536 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 632892 | |
| 3 | 428701 | |
| 4 | 428701 | |
| 6 | 132928 | 6.9% |
| 5 | 117498 | 6.1% |
| 1 | 86693 | 4.5% |
| 2 | 86693 | 4.5% |
| > | 15430 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1929536 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 632892 | |
| 3 | 428701 | |
| 4 | 428701 | |
| 6 | 132928 | 6.9% |
| 5 | 117498 | 6.1% |
| 1 | 86693 | 4.5% |
| 2 | 86693 | 4.5% |
| > | 15430 | 0.8% |
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 4.9 MiB |
| 70-99 | 245531 |
|---|---|
| 100-149 | 202540 |
| 50-69 | 87381 |
| >150 | 65659 |
| <50 | 47211 |
Length
| Max length | 7 |
|---|---|
| Median length | 5 |
| Mean length | 5.377897094 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3486609 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 70-99 |
|---|---|
| 2nd row | 70-99 |
| 3rd row | 100-149 |
| 4th row | 100-149 |
| 5th row | 100-149 |
Common Values
| Value | Count | Frequency (%) |
| 70-99 | 245531 | |
| 100-149 | 202540 | |
| 50-69 | 87381 | 13.5% |
| >150 | 65659 | 10.1% |
| <50 | 47211 | 7.3% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| 70-99 | 245531 | |
| 100-149 | 202540 | |
| 50-69 | 87381 | 13.5% |
| >150 | 65659 | 10.1% |
| <50 | 47211 | 7.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 850862 | |
| 9 | 780983 | |
| - | 535452 | |
| 1 | 470739 | |
| 7 | 245531 | 7.0% |
| 4 | 202540 | 5.8% |
| 5 | 200251 | 5.7% |
| 6 | 87381 | 2.5% |
| > | 65659 | 1.9% |
| < | 47211 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2838287 | |
| Dash Punctuation | 535452 | 15.4% |
| Math Symbol | 112870 | 3.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 850862 | |
| 9 | 780983 | |
| 1 | 470739 | |
| 7 | 245531 | 8.7% |
| 4 | 202540 | 7.1% |
| 5 | 200251 | 7.1% |
| 6 | 87381 | 3.1% |
Math Symbol
| Value | Count | Frequency (%) |
| > | 65659 | |
| < | 47211 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 535452 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3486609 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 850862 | |
| 9 | 780983 | |
| - | 535452 | |
| 1 | 470739 | |
| 7 | 245531 | 7.0% |
| 4 | 202540 | 5.8% |
| 5 | 200251 | 5.7% |
| 6 | 87381 | 2.5% |
| > | 65659 | 1.9% |
| < | 47211 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3486609 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 850862 | |
| 9 | 780983 | |
| - | 535452 | |
| 1 | 470739 | |
| 7 | 245531 | 7.0% |
| 4 | 202540 | 5.8% |
| 5 | 200251 | 5.7% |
| 6 | 87381 | 2.5% |
| > | 65659 | 1.9% |
| < | 47211 | 1.4% |
| Distinct | 333792 |
|---|---|
| Distinct (%) | 51.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.016500895 × 10¹⁴ |
| Minimum | 2.0121061 × 10¹⁴ |
|---|---|
| Maximum | 2.020106138 × 10¹⁴ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.9 MiB |
Quantile statistics
| Minimum | 2.0121061 × 10¹⁴ |
|---|---|
| 5-th percentile | 2.0121061 × 10¹⁴ |
| Q1 | 2.0151061 × 10¹⁴ |
| median | 2.017106112 × 10¹⁴ |
| Q3 | 2.019106112 × 10¹⁴ |
| 95-th percentile | 2.020106121 × 10¹⁴ |
| Maximum | 2.020106138 × 10¹⁴ |
| Range | 8.000038074 × 10¹¹ |
| Interquartile range (IQR) | 4.000012078 × 10¹¹ |
Descriptive statistics
| Standard deviation | 2.491507909 × 10¹¹ |
|---|---|
| Coefficient of variation (CV) | 0.001235560032 |
| Kurtosis | -0.9710262213 |
| Mean | 2.016500895 × 10¹⁴ |
| Median Absolute Deviation (MAD) | 2.000011704 × 10¹¹ |
| Skewness | -0.2411128973 |
| Sum | 1.606980786 × 10¹⁸ |
| Variance | 6.207611663 × 10²² |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.019106118 × 10¹⁴ | 13 | < 0.1% |
| 2.0141061 × 10¹⁴ | 12 | < 0.1% |
| 2.01910612 × 10¹⁴ | 12 | < 0.1% |
| 2.0121061 × 10¹⁴ | 12 | < 0.1% |
| 2.020106116 × 10¹⁴ | 11 | < 0.1% |
| 2.02010612 × 10¹⁴ | 11 | < 0.1% |
| 2.0141061 × 10¹⁴ | 11 | < 0.1% |
| 2.0151061 × 10¹⁴ | 11 | < 0.1% |
| 2.016106121 × 10¹⁴ | 11 | < 0.1% |
| 2.0151061 × 10¹⁴ | 11 | < 0.1% |
| Other values (333782) | 648207 |
| Value | Count | Frequency (%) |
| 2.0121061 × 10¹⁴ | 2 | < 0.1% |
| 2.0121061 × 10¹⁴ | 4 | < 0.1% |
| 2.0121061 × 10¹⁴ | 3 | < 0.1% |
| 2.0121061 × 10¹⁴ | 3 | < 0.1% |
| 2.0121061 × 10¹⁴ | 1 | < 0.1% |
| 2.0121061 × 10¹⁴ | 3 | < 0.1% |
| 2.0121061 × 10¹⁴ | 2 | < 0.1% |
| 2.0121061 × 10¹⁴ | 3 | < 0.1% |
| 2.0121061 × 10¹⁴ | 2 | < 0.1% |
| 2.0121061 × 10¹⁴ | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 2.020106138 × 10¹⁴ | 1 | < 0.1% |
| 2.020106138 × 10¹⁴ | 2 | < 0.1% |
| 2.020106138 × 10¹⁴ | 2 | < 0.1% |
| 2.020106138 × 10¹⁴ | 1 | < 0.1% |
| 2.020106138 × 10¹⁴ | 2 | < 0.1% |
| 2.020106138 × 10¹⁴ | 2 | < 0.1% |
| 2.020106138 × 10¹⁴ | 1 | < 0.1% |
| 2.020106138 × 10¹⁴ | 3 | < 0.1% |
| 2.020106138 × 10¹⁴ | 1 | < 0.1% |
| 2.020106138 × 10¹⁴ | 5 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
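The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator uses: the helper names `average_ranks`, `pearson` and `spearman` are hypothetical, and giving tied values their average rank is an assumption about how ties are handled.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
```

On this toy data ρ is 1 up to floating-point rounding because the relationship is perfectly monotonic, while Pearson's r stays below 1 because the relationship is not linear.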
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
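The invariance property can be checked on a small made-up example. The sketch below uses a hypothetical `pearson_r` helper written from the definition above (it is not the report generator's own code) and confirms that shifting and positively rescaling one variable leaves r unchanged.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of the covariance and standard deviations cancel.
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_original = pearson_r(x, y)

# Change the location (+7) and the scale (x10) of y only.
y_rescaled = [10 * v + 7 for v in y]
r_rescaled = pearson_r(x, y_rescaled)
```

Both calls give the same coefficient. Only a positive scale factor is used here; a negative factor would flip the sign of r.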
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
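The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with the hypothetical name `kendall_tau_a`; statistical libraries often apply a tie correction (τ-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
```

Of the 10 pairs in this toy data, 8 are concordant and 2 are discordant, so τ = (8 - 2) / 10 = 0.6.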
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
First rows
| | reportingmunicipalityid | statyear | personpseudoid | ageclass | nationalityclass | populationtype | maritalstatusclass | arrivalyearmunicipality | arrivalyearswitzerland | egid | gbaups | gkats | gastws | gazwot | eh | nh | ewid | wazimclass | wareaclass | householdid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1061 | 2012 | 201240015276381 | 30 | Switzerland | 1 | 1 | 9997 | -5.0 | M9+DpAhjwnQRpDeeMxtGpPzNcu3BmCRYu+f6PPDpcFw= | 8014 | 1025 | 11 | 43 | 2664100 | 1211900 | 3 | 3-4 | 70-99 | 201210610007544 |
| 1 | 1061 | 2012 | 201240008394617 | 60 | Central Europe | 1 | 2 | 1978 | 1978.0 | M9+DpAhjwnQRpDeeMxtGpPzNcu3BmCRYu+f6PPDpcFw= | 8014 | 1025 | 11 | 43 | 2664100 | 1211900 | 3 | 3-4 | 70-99 | 201210610007544 |
| 2 | 1061 | 2012 | 201240008530748 | 65 | Southeastern Europe | 1 | 2 | 1997 | 1992.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 3 | 1061 | 2012 | 201240007838074 | 10 | Southeastern Europe | 1 | 1 | 9997 | -5.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 4 | 1061 | 2012 | 201240009196289 | 15 | Southeastern Europe | 1 | 1 | 9997 | -5.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 5 | 1061 | 2012 | 201240015693264 | 35 | Southeastern Europe | 1 | 2 | 1997 | 1992.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 6 | 1061 | 2012 | 201240012665547 | 10 | Southeastern Europe | 1 | 1 | 9997 | -5.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 7 | 1061 | 2012 | 201240009727178 | 60 | Southeastern Europe | 1 | 2 | 1997 | 1990.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 8 | 1061 | 2012 | 201240009651923 | 35 | Southeastern Europe | 1 | 2 | 1997 | 1991.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 2 | 3-4 | 100-149 | 201210610007311 |
| 9 | 1061 | 2012 | 201240012435239 | 7 | Switzerland | 1 | 1 | 9997 | -5.0 | 5ksXQiLBDK0QW/BfKkShbFkU6VpevtuxJnkCr9PImAc= | 8011 | 1025 | 8 | 27 | 2664300 | 1212000 | 16 | 3-4 | 100-149 | 201210610007308 |
Last rows
| | reportingmunicipalityid | statyear | personpseudoid | ageclass | nationalityclass | populationtype | maritalstatusclass | arrivalyearmunicipality | arrivalyearswitzerland | egid | gbaups | gkats | gastws | gazwot | eh | nh | ewid | wazimclass | wareaclass | householdid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 648312 | 1061 | 2020 | 202040008787727 | 40 | Switzerland | 1 | 2 | 2010 | NaN | mXJLPJGSb0tpUTG9WaASo02TE6pNfUgPctdBRxmrgfQ= | 8020 | 1025 | 3 | 7 | 2667100 | 1210100 | 4 | 3-4 | 100-149 | 202010611192378 |
| 648313 | 1061 | 2020 | 202040012662047 | 60 | Other | 1 | 2 | 2001 | 1986.0 | 39dfXfX6JFTCHXqDfbowd1vZr1otQOBy43ZJP4KFYOQ= | 8014 | 1025 | 5 | 8 | 2662700 | 1211400 | 7 | 3-4 | 70-99 | 202010611986397 |
| 648314 | 1061 | 2020 | 202040010178319 | 1 | Southern Europe | 1 | 1 | 9997 | -5.0 | mHgz4OqC95vIpWu8ha556jGuJ9+RjvMUNK3CpC7FXVw= | 8014 | 1025 | 7 | 36 | 2662200 | 1211100 | 11 | 3-4 | 70-99 | 202010611985166 |
| 648315 | 1061 | 2020 | 202040004662844 | 30 | Southern Europe | 1 | 2 | 2015 | 2015.0 | mHgz4OqC95vIpWu8ha556jGuJ9+RjvMUNK3CpC7FXVw= | 8014 | 1025 | 7 | 36 | 2662200 | 1211100 | 11 | 3-4 | 70-99 | 202010611985166 |
| 648316 | 1061 | 2020 | 202040009605936 | 60 | Switzerland | 1 | 2 | 1983 | NaN | +v7aZSHghkrwavngiP7YtMCM4SVdcXTVnibQwVqDW0w= | 8021 | 1025 | 7 | 16 | 2666600 | 1210700 | 17 | 3-4 | 100-149 | 202010611209123 |
| 648317 | 1061 | 2020 | 202040005547751 | 20 | Switzerland | 1 | 1 | 9997 | -5.0 | +v7aZSHghkrwavngiP7YtMCM4SVdcXTVnibQwVqDW0w= | 8021 | 1025 | 7 | 16 | 2666600 | 1210700 | 17 | 3-4 | 100-149 | 202010611209123 |
| 648318 | 1061 | 2020 | 202040004692795 | 60 | Switzerland | 1 | 2 | 1966 | NaN | +v7aZSHghkrwavngiP7YtMCM4SVdcXTVnibQwVqDW0w= | 8021 | 1025 | 7 | 16 | 2666600 | 1210700 | 17 | 3-4 | 100-149 | 202010611209123 |
| 648319 | 1061 | 2020 | 202040004207359 | 7 | Switzerland | 1 | 1 | 9997 | -5.0 | mXJLPJGSb0tpUTG9WaASo02TE6pNfUgPctdBRxmrgfQ= | 8020 | 1025 | 3 | 7 | 2667100 | 1210100 | 4 | 3-4 | 100-149 | 202010611192378 |
| 648320 | 1061 | 2020 | 202040006045509 | 40 | Switzerland | 1 | 2 | 2002 | 2002.0 | +v7aZSHghkrwavngiP7YtMCM4SVdcXTVnibQwVqDW0w= | 8021 | 1025 | 7 | 16 | 2666600 | 1210700 | 2 | 3-4 | 70-99 | 202010611209124 |
| 648321 | 1061 | 2020 | 202040007053657 | 50 | Switzerland | 1 | 2 | 1990 | 1990.0 | +v7aZSHghkrwavngiP7YtMCM4SVdcXTVnibQwVqDW0w= | 8021 | 1025 | 7 | 16 | 2666600 | 1210700 | 2 | 3-4 | 70-99 | 202010611209124 |